It contains monthly returns for five value-weighted portfolios from October 1969 to September 2019. The portfolios are formed using all stocks traded at NYSE, NASDAQ, and AMEX. A proxy for the level of operating profitability (OP) is computed for each stock. Then, the stocks are sorted according to OP and are split evenly into 5 groups (quintiles).
LO contains the lowest OP quintile stocks whereas portfolio HI contains the highest OP quintile stocks. Portfolios QNT2, QNT3, and QNT4 are formed in the obvious fashion using the remaining stocks. The provided file also contains market returns (MKT)2 and T-bill rates (RF). Note that all returns are expressed in percentage points.
# setting working directory
setwd("~/Documents/Coursework/COEC371/problem_sets/MarkedAssignment")
# importing libraries
library(tidyverse)
## ── Attaching packages ─────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 3.2.1 ✔ purrr 0.3.3
## ✔ tibble 2.1.3 ✔ dplyr 0.8.3
## ✔ tidyr 1.0.0 ✔ stringr 1.4.0
## ✔ readr 1.3.1 ✔ forcats 0.4.0
## Warning: package 'ggplot2' was built under R version 3.5.2
## Warning: package 'tibble' was built under R version 3.5.2
## Warning: package 'tidyr' was built under R version 3.5.2
## Warning: package 'purrr' was built under R version 3.5.2
## Warning: package 'dplyr' was built under R version 3.5.2
## Warning: package 'stringr' was built under R version 3.5.2
## Warning: package 'forcats' was built under R version 3.5.2
## ── Conflicts ────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
library(readxl)
## Warning: package 'readxl' was built under R version 3.5.2
library(stats)
library(plotly)
##
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
##
## last_plot
## The following object is masked from 'package:stats':
##
## filter
## The following object is masked from 'package:graphics':
##
## layout
library(ggthemes)
## Warning: package 'ggthemes' was built under R version 3.5.2
Importing data from VW_5_OP_Mkt_Rf_Monthly_196910_201909.xlsx
data =read_excel('VW_5_OP_Mkt_Rf_Monthly_196910_201909.xlsx')
# calculating the risk premium
data <- data %>% mutate(RP = (MKT) - RF)
summary(data)
## Date LO QNT2 QNT3
## Min. :196910 Min. :-28.4700 Min. :-21.970 Min. :-21.0900
## 1st Qu.:198204 1st Qu.: -2.4450 1st Qu.: -1.627 1st Qu.: -1.5675
## Median :199410 Median : 1.0450 Median : 1.245 Median : 1.2450
## Mean :199432 Mean : 0.7295 Mean : 0.884 Mean : 0.9728
## 3rd Qu.:200703 3rd Qu.: 4.1500 3rd Qu.: 3.817 3rd Qu.: 3.5200
## Max. :201909 Max. : 17.6200 Max. : 18.680 Max. : 17.5900
## QNT4 HI MKT RF
## Min. :-20.2400 Min. :-24.000 Min. :-22.6400 Min. :0.0000
## 1st Qu.: -1.6125 1st Qu.: -1.532 1st Qu.: -1.7500 1st Qu.:0.1400
## Median : 1.2600 Median : 1.195 Median : 1.2650 Median :0.4000
## Mean : 0.9658 Mean : 1.031 Mean : 0.9275 Mean :0.3812
## 3rd Qu.: 3.8175 3rd Qu.: 3.922 3rd Qu.: 3.8975 3rd Qu.:0.5400
## Max. : 18.6500 Max. : 16.880 Max. : 16.6100 Max. :1.3500
## RP
## Min. :-23.2400
## 1st Qu.: -2.0250
## Median : 0.9200
## Mean : 0.5463
## 3rd Qu.: 3.5050
## Max. : 16.1000
str(data)
## Classes 'tbl_df', 'tbl' and 'data.frame': 600 obs. of 9 variables:
## $ Date: num 196910 196911 196912 197001 197002 ...
## $ LO : num 6.25 -4.77 -2.4 -6.84 7.26 ...
## $ QNT2: num 4.61 -2.98 -3.47 -7.13 8.54 0.64 -9.37 -4.67 -5.93 8.11 ...
## $ QNT3: num 4.49 -4.18 -1.3 -7.88 5.32 0.12 -9.56 -4.51 -4.33 10.2 ...
## $ QNT4: num 6.26 -3.74 -2.07 -7.15 5.01 ...
## $ HI : num 6.21 -2.17 -0.33 -7.91 4.08 ...
## $ MKT : num 5.66 -3.27 -1.99 -7.5 5.75 -0.49 -10.5 -6.39 -5.21 7.45 ...
## $ RF : num 0.6 0.52 0.64 0.6 0.62 0.57 0.5 0.53 0.58 0.52 ...
## $ RP : num 5.06 -3.79 -2.63 -8.1 5.13 -1.06 -11 -6.92 -5.79 6.93 ...
to perform Regression Analysis for all five OP portfolios. regress portfolio excess returns on market excess returns.
What are the alpha estimates for the 5 OP portfolios and the standard errors of these estimates?
# attempt to regress
reg1 = data %>% lm(formula = (LO-RF) ~ RP)
reg2 = data %>% lm(formula = (QNT2-RF) ~ RP)
reg3 = data %>% lm(formula = ((QNT3)-RF) ~ RP)
reg4 = data %>% lm(formula = ((QNT4)-RF) ~ RP)
reg5 = data %>% lm(formula = ((HI)-RF) ~ RP)
#create a table
stock <- c('LO','QNT2','QNT3','QNT4','HI')
betas <- c(reg1$coefficients[2],
reg2$coefficients[2],
reg3$coefficients[2],
reg4$coefficients[2],
reg5$coefficients[2])
alphas <- c(reg1$coefficients[1],
reg2$coefficients[1],
reg3$coefficients[1],
reg4$coefficients[1],
reg5$coefficients[1])
df <- tibble(stock,alphas,betas)
df
## # A tibble: 5 x 3
## stock alphas betas
## <chr> <dbl> <dbl>
## 1 LO -0.299 1.19
## 2 QNT2 -0.0371 0.988
## 3 QNT3 0.0699 0.955
## 4 QNT4 0.0511 0.977
## 5 HI 0.125 0.960
according betas, LO should have the highest return and QNT3 should have the lowest.
just to see how the data distribution looks
# density plots
plot(density(data$LO))
plot(density(data$QNT2))
plot(density(data$QNT3))
plot(density(data$QNT4))
plot(density(data$HI))
### Plot SML, as predicted by the CAPM
# plot for part f
plot1 <- data %>% ggplot( aes(y = (LO-RF), x = RP*12))+geom_point(col='yellow4') + xlab('Market Excess Return') + ylab('Portfolio Excess Return') + ggtitle('Does CAPM hold for every portfolio?') + geom_smooth(method='lm')+geom_point(aes(y = (QNT2-RF)),col='yellow3')+geom_point(aes(y=(QNT3-RF)),col='yellow2')+geom_point(aes(y=(QNT4-RF)),col='yellow1')+geom_point(aes(y=(HI-RF)),col='lightyellow')+theme_classic()
plot1 <- ggplotly(plot1)
plot1
On a new graph, plot the CAPM-predicted risk premium (y-axis) vs. the realized excess return (x-axis) for each portfolio. Note that (i) the CAPM-predicted risk premium is given by beta times average market excess return and (ii) the realized risk premium is equal to average excess return. Use annualized returns for easier interpre- tation. What should the plot look like if the CAPM holds? What can you say about the validity of the CAPM from this graph? [Hint: plot the 45-degree line on the same graph.]
CAPM <- data %>% ggplot( aes(y = (reg1$coefficients[2]*mean(RP*12)), x = mean(LO)))+geom_point(col='yellow4')+geom_abline(intercept = 0,slope = 1,color='red',linetype='dashed',size=1.5)+ xlab('Realised Excess Return') + ylab('CAPM - predicted risk premium') + ggtitle('Does CAPM hold for every portfolio?') +geom_point(aes(y = (reg2$coefficients[2]*mean(RP*12)), x = mean(QNT2)),col='yellow3')+geom_point(aes(y = (reg3$coefficients[2]*mean(RP*12)), x = mean(QNT3)),col='yellow2')+geom_point(aes(y = (reg4$coefficients[2]*mean(RP*12)), x = mean(QNT4)),col='yellow1')+geom_point(aes(y = (reg5$coefficients[2]*mean(RP*12)), x = mean(LO)),col='yellow')
CAPM <- ggplotly(CAPM)
CAPM
# PARTB initial data transformation
# expected values
E_LO = mean(data$LO)
E_HI = mean(data$HI)
# risk
sd_LO = sd(data$LO)
sd_HI = sd(data$HI)
# correlation
HILO = cor(data$LO,data$HI)
#table
answer = tibble(E_LO,E_HI,sd_LO,sd_HI,HILO)
answer
## # A tibble: 1 x 5
## E_LO E_HI sd_LO sd_HI HILO
## <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 0.730 1.03 5.71 4.48 0.846
annualised_er = answer[1:2]*12
annualised_sd = answer[3:4]*sqrt(12)
annualised_er
## E_LO E_HI
## 1 8.7544 12.3698
annualised_sd
## sd_LO sd_HI
## 1 19.76415 15.51988
#correlation
HILO
## [1] 0.8462735
# Sharpe ratio calculation
SharpeRatio_LO = annualised_er/annualised_sd
SharpeRatio_LO = rename(SharpeRatio_LO, SR_LO = E_LO,SR_HI = E_HI)
SharpeRatio_LO
## SR_LO SR_HI
## 1 0.4429434 0.7970295
As can be seen from the output, HI provides higher return for the given risk.
# creating weights
weights <-seq(from =-1.5,to = 1.5,length.out=30)
# data table with the weights
df_w <- tibble(wLO = weights,
wHI = (1-weights))
# table with er and sd for the given wieghts
B_df <- df_w %>% mutate(er_P = wLO*E_LO*12 + wHI*E_HI*12,sd_P = sqrt((wLO*sd_LO*sqrt(12))^2 + (wHI*sd_HI*sqrt(12))^2 + 2*wLO*wHI*HILO))
B_df
## # A tibble: 30 x 4
## wLO wHI er_P sd_P
## <dbl> <dbl> <dbl> <dbl>
## 1 -1.5 2.5 17.8 48.8
## 2 -1.40 2.40 17.4 46.3
## 3 -1.29 2.29 17.0 43.8
## 4 -1.19 2.19 16.7 41.3
## 5 -1.09 2.09 16.3 38.8
## 6 -0.983 1.98 15.9 36.3
## 7 -0.879 1.88 15.5 33.9
## 8 -0.776 1.78 15.2 31.5
## 9 -0.672 1.67 14.8 29.1
## 10 -0.569 1.57 14.4 26.8
## # … with 20 more rows
# top 6 entries maximum Portfolio SD
head(arrange(B_df,desc(sd_P)))
## # A tibble: 6 x 4
## wLO wHI er_P sd_P
## <dbl> <dbl> <dbl> <dbl>
## 1 -1.5 2.5 17.8 48.8
## 2 -1.40 2.40 17.4 46.3
## 3 -1.29 2.29 17.0 43.8
## 4 -1.19 2.19 16.7 41.3
## 5 -1.09 2.09 16.3 38.8
## 6 -0.983 1.98 15.9 36.3
recheck part d
#weights
cp_weights <-seq(from =0,to = 1,length.out=30)
# data table with weights
df_cp <- tibble(wRF = cp_weights,
wP = (1-cp_weights))
# table with cp
cp_df <- B_df %>% mutate(er_CP = df_cp$wRF*0.03 + df_cp$wP*er_P,sd_CP = (df_cp$wP*sd_P))
cp_df
## # A tibble: 30 x 6
## wLO wHI er_P sd_P er_CP sd_CP
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 -1.5 2.5 17.8 48.8 17.8 48.8
## 2 -1.40 2.40 17.4 46.3 16.8 44.7
## 3 -1.29 2.29 17.0 43.8 15.9 40.7
## 4 -1.19 2.19 16.7 41.3 14.9 37.0
## 5 -1.09 2.09 16.3 38.8 14.1 33.4
## 6 -0.983 1.98 15.9 36.3 13.2 30.1
## 7 -0.879 1.88 15.5 33.9 12.3 26.9
## 8 -0.776 1.78 15.2 31.5 11.5 23.9
## 9 -0.672 1.67 14.8 29.1 10.7 21.1
## 10 -0.569 1.57 14.4 26.8 9.96 18.5
## # … with 20 more rows
# plot
ggplot()+geom_point(data = B_df,aes(y=er_P,x=sd_P))+geom_point(data = tibble(sd =c(sd_LO*sqrt(12),sd_HI*sqrt(12)),mean = c(E_LO*12,E_HI*12)),aes(x=sd,y=mean),color = "red",size=3,shape=18)+ggtitle("Portfolio returns",subtitle = "with LO and HI")+xlab("standard deviation")+ylab("expected returns")+scale_y_continuous()+scale_x_continuous()+geom_hline(yintercept = 3)+ylim(0,15)+geom_line(data = cp_df,aes(y=er_CP,x=sd_CP),color = 'red')
## Scale for 'y' is already present. Adding another scale for 'y', which
## will replace the existing scale.
## Warning: Removed 8 rows containing missing values (geom_point).
## Warning: Removed 3 rows containing missing values (geom_path).
### tangent portfolio weights
numerator = ((mean(E_LO*12)-3)*sd_HI -(mean(E_HI*12)-3))*HILO
denominator = (mean(E_LO*12)-3)*sd_HI +(mean(E_HI*12)-0.03)*sd_LO-(mean(E_HI*12)-3+mean(E_LO*12)-3)*HILO
cw_LO = numerator/denominator
cw_LO
## [1] 0.1665551
cw_HI = 1-cw_LO
# lets compute the er and sd
er_cp = cw_LO*E_LO*12 + cw_HI*E_HI*12
sd_cp = sqrt((cw_LO*sd_LO*sqrt(12))^2 + (cw_HI*sd_HI*sqrt(12))^2 + 2*cw_LO*cw_HI*HILO)
# sharpe ratio
sharpe_cp = (er_cp - 3)/sd_cp
#output
er_cp
## [1] 11.76764
sd_cp
## [1] 13.35606
sharpe_cp
## [1] 0.6564539
now that we have formed the tangent line, and figured out the tangent portfolio. now we will try to find the complete portfolio, with 12% expected return
# expected returns and the sd for the tangent portfolio
e_tp = df_cp$wP*er_cp + df_cp$wRF*3
sd_tp = df_cp$wP*sd_cp
final_df = tibble(e_tp,sd_tp)
max(final_df$e_tp)
## [1] 11.76764
doubt dont know how to exactly get the last answer, according to my tangent portfolio return, my max is 11.76764, hence getting 12 is impossible, close to 12, which is 11.76764 i am getting 0% weight on the rf asset, hence providing me with the tangent portfolio with no shorting capabiities. Plot below
test <- ggplot()+geom_point(data = B_df,aes(y=er_P,x=sd_P))+geom_point(data = tibble(sd =c(sd_LO*sqrt(12),sd_HI*sqrt(12)),mean = c(E_LO*12,E_HI*12)),aes(x=sd,y=mean),color = "red",size=3,shape=18)+ggtitle("Portfolio returns",subtitle = "with LO and HI")+xlab("standard deviation")+ylab("expected returns")+scale_y_continuous()+scale_x_continuous()+geom_hline(yintercept = 3)+geom_line(data = final_df,aes(y=e_tp,x=sd_tp),color='blue')
test <- ggplotly(test)
test